A Study on L2-Loss (Squared Hinge-Loss) Multiclass SVM
Authors
Abstract
Crammer and Singer's method is one of the most popular multiclass support vector machines (SVMs). It considers L1 loss (hinge loss) in a complicated optimization problem. In SVM, squared hinge loss (L2 loss) is a common alternative to L1 loss, but surprisingly we have not seen any paper studying the details of Crammer and Singer's method using L2 loss. In this letter, we conduct a thorough investigation. We show that the derivation is not trivial and has some subtle differences from the L1 case. The details provided in this work can serve as a useful reference for those who intend to use Crammer and Singer's method with L2 loss, sparing them the tedious process of deriving everything themselves. Furthermore, we present some new results on, and discussion of, both the L1- and L2-loss formulations.
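For reference, a sketch of the two loss variants in the standard Crammer and Singer formulation (notation assumed here: training pairs (x_i, y_i), i = 1, ..., l, one weight vector w_m per class m = 1, ..., k):

\min_{w_1,\dots,w_k}\ \frac{1}{2}\sum_{m=1}^{k}\|w_m\|^2 + C\sum_{i=1}^{l}\xi_i \quad \text{(L1 loss)} \qquad \text{vs.} \qquad \min_{w_1,\dots,w_k}\ \frac{1}{2}\sum_{m=1}^{k}\|w_m\|^2 + C\sum_{i=1}^{l}\xi_i^2 \quad \text{(L2 loss)},

where \xi_i = \max_{m \neq y_i} \max\bigl(0,\ 1 - (w_{y_i} - w_m)^T x_i\bigr) is the largest margin violation of example i over the wrong classes.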
Similar Resources
The Lq Support Vector Machine
The standard Support Vector Machine (SVM) minimizes the hinge loss function subject to the L2 penalty or the roughness penalty. Recently, the L1 SVM was suggested for variable selection by producing sparse solutions (Bradley and Mangasarian, 1998; Zhu et al., 2003). These learning methods are non-adaptive since their penalty forms are pre-determined before looking at data, and they often perform well only in a certain type of situat...
Multiclass Boosting with Hinge Loss based on Output Coding
Multiclass classification is an important and fundamental problem in machine learning. A popular family of multiclass classification methods reduces multiclass to binary based on output coding. Several multiclass boosting algorithms have been proposed to learn the coding matrix and the associated binary classifiers in a problem-dependent way. These algorithms can be unified under a s...
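As a rough illustration of the output-coding reduction mentioned above (not the boosting algorithms this paper proposes), a minimal sketch using scikit-learn's OutputCodeClassifier, which trains one binary classifier per column of a randomly generated coding matrix:

# Minimal sketch: reduce multiclass to binary via output coding.
# This uses a random coding matrix, not a learned, problem-dependent one.
from sklearn.datasets import load_iris
from sklearn.multiclass import OutputCodeClassifier
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)
# code_size controls how many binary problems are created per class.
clf = OutputCodeClassifier(LinearSVC(max_iter=10000), code_size=2, random_state=0)
clf.fit(X, y)
print(clf.predict(X[:5]))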
L1 and L2 regularization for multiclass hinge loss models
This paper investigates the relationship between the loss function, the type of regularization, and the resulting model sparsity of discriminatively-trained multiclass linear models. The effects on sparsity of optimizing log loss are straightforward: L2 regularization produces very dense models while L1 regularization produces much sparser models. However, optimizing hinge loss yields more nuan...
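A minimal sketch of the sparsity contrast described above, assuming scikit-learn's LinearSVC (a one-vs-rest linear model with squared hinge loss, not the exact models studied in the paper):

# Compare weight sparsity under L2 vs. L1 regularization.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)
# L2 regularization typically yields dense weight matrices.
l2 = LinearSVC(penalty="l2", loss="squared_hinge", max_iter=10000).fit(X, y)
# L1 regularization zeroes out many weights; liblinear requires dual=False here.
l1 = LinearSVC(penalty="l1", loss="squared_hinge", dual=False, max_iter=10000).fit(X, y)
print("nonzero weights, L2:", np.count_nonzero(l2.coef_))
print("nonzero weights, L1:", np.count_nonzero(l1.coef_))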
Support vector machines with adaptive Lq penalty
The standard Support Vector Machine (SVM) minimizes the hinge loss function subject to the L2 penalty or the roughness penalty. Recently, the L1 SVM was suggested for variable selection by producing sparse solutions (Bradley and Mangasarian, 1998; Zhu et al., 2003). These learning methods are non-adaptive since their penalty forms are pre-determined before looking at data, and they often perfor...
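In the notation assumed here (coefficients w_j, tuning parameters \lambda > 0 and q > 0), the Lq-penalized linear SVM objective behind this line of work can be sketched as

\min_{w, b}\ \sum_{i=1}^{n} \bigl[1 - y_i (w^T x_i + b)\bigr]_+ \ +\ \lambda \sum_{j=1}^{p} |w_j|^q,

where [u]_+ = \max(u, 0) is the hinge loss; q = 2 gives the standard ridge-type penalty, q = 1 the sparsity-inducing L1 penalty, and the adaptive variant selects q (and \lambda) from the data rather than fixing them in advance.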
Stochastic functional descent for learning Support Vector Machines
We present a novel method for learning Support Vector Machines (SVMs) in the online setting. Our method is generally applicable in that it handles the online learning of the binary, multiclass, and structural SVMs in a unified view. The SVM learning problem consists of optimizing a convex objective function that is composed of two parts: the hinge loss and quadratic (L2) regularization. To date...
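For context, the objective named above (hinge loss plus L2 regularization) admits a simple stochastic subgradient sketch; this is a generic Pegasos-style update, not the functional-descent method the paper proposes:

# Stochastic subgradient descent on
#   (lam/2) * ||w||^2 + (1/n) * sum_i max(0, 1 - y_i * w.x_i)
# Labels y are assumed to be +1 / -1.
import numpy as np

def sgd_hinge(X, y, lam=0.01, epochs=5, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)        # standard decaying step size
            margin = y[i] * (X[i] @ w)   # evaluate before updating w
            w *= 1.0 - eta * lam         # gradient of the regularizer
            if margin < 1:
                w += eta * y[i] * X[i]   # subgradient of the hinge term
    return w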
Journal: Neural Computation
Volume: 25, Issue: 5
Pages: -
Publication year: 2013